FOREWORD


This report describes an artificial neural network (ANN) designed to recover an arbitrary
transformation that relates two images. This ANN computes local gradients between the
boundaries of the image and its transformed copy. Using the Gaussian average of these gradients,
the network acts to reverse the action of the transformation, thereby recovering the
transformation relating the two images.
ABSTRACT


Given  an  object  and  a  copy  of  itself  produced  by  an  unknown  two-dimensional  affine 
transformation,  a  new  neural  network  architecture  has  been  developed  that  recovers  this 
transformation  by  minimizing  the  symmetric  difference  between  the  object  and  the  copy.  This 
architecture  performs  a  gradient  descent  in  symmetric  difference  error  space  and  is  designated  as 
visual  gradient  descent  (VGD).  The  VGD  network  has  applications  to  both  two-  and  three- 
dimensional  model  based  automatic  target  recognition  (ATR)  and  image  compression  using  iterated 
function  systems. 
NOMENCLATURE

The intensity value at pixel (i, j). For monochrome images pij = 0 or 1
The symmetric difference error measure
A transformation on R^2 consisting of a rotation and a 2 degree of freedom (DoF) translation
A transformation on R^3 consisting of a 3DoF rotation and a 2DoF translation
A projection mapping from R^3 into R^2
An affine transformation on R^2
The integrated image intensity at pixel (i, j) for orientation number n
The local gradient at pixel (i, j) for orientation number n and time t
The Gaussian gradient at pixel (iλ, jλ) for orientation n and time t
The correction vector for point number λ, λ = 1..3, at time t
The affine transformation that takes point pλ to point pλ', λ = 1..3, at time t
The Lebesgue measure of the left side of the one-dimensional simple cell residing in the set I
The Lebesgue measure of the right side of the one-dimensional simple cell residing in the set I
INTRODUCTION


Many  problems  of  interest  involve  recovering  a  transformation  that  relates  an  object  and  a 
distorted  copy  of  itself.  This  task  occurs  in  model-based  automatic  target  recognition  (ATR)  and  in 
image compression using iterated function systems. In the ATR problem, to properly identify a
target image, a set of reference models needs to be maximally aligned with a target image that has
been subjected to an unknown transformation. In the case of iterated function systems (IFS), an
image is covered with copies of itself obtained using affine transformations. The contractive affine
transformations that achieve maximal overlap between the image and the union of the copies must
be recovered in order to define the IFS.

A  new  neural  network  architecture,  termed  visual  gradient  descent  (VGD),  has  been 
developed  that  recovers  an  unknown  transformation  by  computing  local  symmetric  difference 
gradients  between  a  reference  object  and  its  transformed  copy.  The  VGD  network  solves  for  the 
transformation  that  minimizes  the  global  symmetric  difference  between  the  object  and  its  copy. 
This  process  can  be  viewed  as  a  viable  collective  computation  alternative  to  the  standard  global 
gradient  descent  technique. 

This report begins with a description of the general object comparison problem, with an
emphasis on how various applications can be addressed by specifying the form of the
transformation relating the two objects. Next, some general background on neural network
approaches to vision is provided, followed by a detailed description of the VGD network when the
two objects are related by an affine transformation. Following this, a theoretical analysis is
performed for the case of a one-dimensional affine transformation. The report concludes with
some computer simulations that recover the affine transformation relating two squares and a
discussion of some ongoing and future work that applies the VGD network.


GENERAL  OBJECT  COMPARISON  PROBLEM 

ERROR  MEASURES 

For the purposes of this report, monochrome M by N pixel space P is defined as
$P = \{(i,j,k) \mid i \in \{1,2,\ldots,M\},\ j \in \{1,2,\ldots,N\},\ k \in \{0,1\}\}$. An image I
will be some subset of pixels with nonzero k values. Since any pixel p = (i, j, 1) that is part of
an image has a nonzero k value, we will simplify our notation and write p = (i, j), or indicate
this pixel's k value by $p_{ij}$.

Given  two  pixel  images,  a  measure  of  the  degree  of  similarity  of  the  images  is  often  needed. 
One  such  measure  that  has  the  desirable  property  of  computational  efficiency  is  the  symmetric 
difference measure. The symmetric difference measure is most simply defined for two objects A
and B as

$$E_{sd} = \text{number of pixels in } [A \cup B - (A \cap B)]$$

where  A  is  the  set  of  pixels  that  comprise  object  A,  and  B  is  the  set  of  pixels  that  comprise  object  B. 
This  is  a  set  operation  that  can  be  easily  implemented  for  binary  valued  pixel  images  as  follows. 

1. Let pixel plane a contain object A, and pixel plane b contain object B.

2. Summing over all i and j, compute the sum

$$E_{sd} = \sum_i \sum_j (a_{ij} - b_{ij})^2 = \sum_i \sum_j (a_{ij}^2 - 2a_{ij}b_{ij} + b_{ij}^2)$$

Let $E_{sd}^x$ denote the contribution of pixel x to $E_{sd}$.

If $x \in A \cap B$ (both set A and set B are on), then $a_{ij} = 1$ and $b_{ij} = 1$; hence $E_{sd}^x = 0$.
If $x \in A \setminus B$ (set A is on, set B is off), then $a_{ij} = 1$ and $b_{ij} = 0$; hence $E_{sd}^x = 1$.
Likewise, if $x \in B \setminus A$ (set A is off, set B is on), then $a_{ij} = 0$ and $b_{ij} = 1$, and $E_{sd}^x = 1$.
The remaining possibility is $x \in (A \cup B)'$, where $a_{ij} = 0$ and $b_{ij} = 0$, giving $E_{sd}^x = 0$.

So $E_{sd} = \sum_i \sum_j (a_{ij} - b_{ij})^2 = \sum_x E_{sd}^x$ = number of pixels in $[A \cup B - (A \cap B)]$, as desired.
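
The equivalence derived above can be checked on a small example. The following is a minimal sketch, assuming binary pixel planes stored as nested Python lists; the function names and the two sample planes are illustrative and do not come from the report.

```python
def symmetric_difference_error(a, b):
    """Sum of squared differences over two equal-sized binary pixel planes."""
    return sum((aij - bij) ** 2
               for row_a, row_b in zip(a, b)
               for aij, bij in zip(row_a, row_b))


def symmetric_difference_by_sets(a, b):
    """Count the pixels in (A u B) - (A n B) using explicit pixel sets."""
    set_a = {(i, j) for i, row in enumerate(a) for j, v in enumerate(row) if v}
    set_b = {(i, j) for i, row in enumerate(b) for j, v in enumerate(row) if v}
    return len(set_a ^ set_b)  # ^ is the set symmetric difference


# Hypothetical 3 x 3 planes: a 2 x 2 block and the same block shifted right.
plane_a = [[1, 1, 0],
           [1, 1, 0],
           [0, 0, 0]]
plane_b = [[0, 1, 1],
           [0, 1, 1],
           [0, 0, 0]]

print(symmetric_difference_error(plane_a, plane_b))
print(symmetric_difference_by_sets(plane_a, plane_b))
```

Both functions return the same count for any pair of binary planes, which is the point of the pixel-by-pixel argument.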


TRANSFORMATIONS  RELATING  OBJECTS 

The application of interest determines the transformation that relates the two pixel images
that need to be compared. In the case of two-dimensional ATR where translation, within-plane
rotation, and uniform scaling are allowed, the transformed object or target image O is obtained
from one of the reference objects $O_i$ using a transform A of the form

$$A\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} r\cos\theta & -r\sin\theta \\ r\sin\theta & r\cos\theta \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix} + \begin{bmatrix} e \\ f \end{bmatrix}$$

where r is the scale factor, $\theta$ is the rotation angle, and e and f are translations. Target
identification is accomplished by finding which of the reference models, subjected to the above
transformation, can be best aligned with the target image (see Figure 1).
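
As an illustration of the two-dimensional ATR transform, the sketch below applies a scale, rotation, and translation to a single point. It assumes angles in radians and points given as (x, y) tuples; the function name is hypothetical.

```python
import math


def similarity_transform(point, r, theta, e, f):
    """Scale by r, rotate by theta (radians), then translate by (e, f)."""
    x, y = point
    c, s = math.cos(theta), math.sin(theta)
    return (r * (c * x - s * y) + e,
            r * (s * x + c * y) + f)


# Rotate (1, 0) by 90 degrees, double its length, then shift by (1, -1).
print(similarity_transform((1.0, 0.0), r=2.0, theta=math.pi / 2, e=1.0, f=-1.0))
```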
Another problem of interest is the three-dimensional ATR problem where the target image
and model images have been projected into $R^2$. In this case, the target image is obtained from one
of the reference images via the composition of a 6 degree of freedom (DoF) mapping F with a
projection mapping $\Pi : R^3 \to R^2$ into the line of sight plane. F is defined in terms of a scale
factor r, x translation $\Delta x$, y translation $\Delta y$, and the angles yaw or $\phi$, a rotation about the
z axis; pitch or $\theta$, a rotation about an intermediary y axis; and roll or $\psi$, a rotation about the
final x axis. In this case, F is given by

$$F\begin{bmatrix} x \\ y \\ z \end{bmatrix} = r\begin{bmatrix} \cos\theta\cos\phi & \cos\theta\sin\phi & -\sin\theta \\ \sin\psi\sin\theta\cos\phi - \cos\psi\sin\phi & \sin\psi\sin\theta\sin\phi + \cos\psi\cos\phi & \cos\theta\sin\psi \\ \cos\psi\sin\theta\cos\phi + \sin\psi\sin\phi & \cos\psi\sin\theta\sin\phi - \sin\psi\cos\phi & \cos\theta\cos\psi \end{bmatrix}\begin{bmatrix} x \\ y \\ z \end{bmatrix} + \begin{bmatrix} \Delta x \\ \Delta y \\ 0 \end{bmatrix}$$

(Reference 1). Figure 2 portrays the three-dimensional ATR problem.
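
The rotation part of F can be written out and checked for orthonormality. The sketch below assumes the yaw, pitch, roll ordering described above, angles in radians, and a translation that acts only on the first two coordinates; the function names are illustrative, and the projection step is left out.

```python
import math


def rotation_matrix(phi, theta, psi):
    """Yaw (phi), pitch (theta), roll (psi) rotation matrix, angles in radians."""
    cphi, sphi = math.cos(phi), math.sin(phi)
    cth, sth = math.cos(theta), math.sin(theta)
    cpsi, spsi = math.cos(psi), math.sin(psi)
    return [
        [cth * cphi, cth * sphi, -sth],
        [spsi * sth * cphi - cpsi * sphi, spsi * sth * sphi + cpsi * cphi, cth * spsi],
        [cpsi * sth * cphi + spsi * sphi, cpsi * sth * sphi - spsi * cphi, cth * cpsi],
    ]


def apply_f(point, r, phi, theta, psi, dx, dy):
    """Scale and rotate a 3-D point, then translate its first two coordinates."""
    rot = rotation_matrix(phi, theta, psi)
    rotated = [sum(rot[i][k] * point[k] for k in range(3)) for i in range(3)]
    return (r * rotated[0] + dx, r * rotated[1] + dy, r * rotated[2])


def is_orthonormal(m, tol=1e-12):
    """True when the rows of m are mutually orthogonal unit vectors."""
    for i in range(3):
        for j in range(3):
            dot = sum(m[i][k] * m[j][k] for k in range(3))
            if abs(dot - (1.0 if i == j else 0.0)) > tol:
                return False
    return True


print(is_orthonormal(rotation_matrix(0.3, -0.7, 1.1)))
```

Because the rows are orthonormal, the six parameters r, the three angles, and the two translations fully describe F.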

In the case of image compression using iterated function systems, the two images are related
by an affine transformation (Reference 2) of the form

$$\begin{bmatrix} r\cos\theta & -s\sin\psi \\ r\sin\theta & s\cos\psi \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix}$$

where r and s are scale factors, and $\theta$ and $\psi$ are generalized rotation angles. As part of the image
compression  process,  an  image  is  covered  with  copies  of  itself  that  are  produced  by  affine 
transformations  (see  Figure  3).  The  overlap  between  the  image  and  each  of  the  collage  pieces  needs 
to  be  maximized  for  accurate  compression.  An  analysis  of  a  human’s  ability  to  optimally  adjust  the 
collage  pieces  led  to  the  development  of  the  VGD  network.  The  VGD  net  reconstitutes  the  affine 
transformation,  which  optimally  aligns  an  image  with  a  copy  of  itself  that  has  been  initially 
subjected  to  an  unknown  affine  transformation. 
FIGURE 2. LABELS: COORDINATE FRAME; TARGET IMAGE; TARGET IMAGE PROJECTED INTO yz PLANE

FIGURE 3. COLLAGE PROCESS ASSOCIATED WITH ITERATED
FUNCTION SYSTEM IMAGE COMPRESSION PROCESS (LABEL: ORIGINAL IMAGE WITH COVERING PIECES)
VISION BACKGROUND


To solve the generic misalignment problem, one must be able to compute the local
misalignments between a reference object and a transformed copy of itself (see Figure 4). For this
discussion, the transformed copy is obtained from the reference object by the application of an
affine transformation A(t). Simple processing cells with a sensitivity to image gradients across
their field of view are appropriate for this task. This type of cell has appeared previously in the
literature as part of boundary detection/completion systems (Reference 3).

These gradient detection cells possess the advantages of being similar to cells in the primate
visual cortex and of being easily implemented in terms of simple artificial neural network
processors. By using an appropriate choice of the cell connection template, these cells may be
adapted for analog very large scale integration (VLSI) implementation. The response of these
cells would be recovered in a manner analogous to the silicon retina of Carver Mead (Reference 4).


FIGURE 4. GENERIC MISALIGNMENT PROBLEM IN R^2


APPROACH 


SIMPLE  CELLS 

Each side of a given simple cell may reside either in the object or transformed object pixel
space. By varying the location of the left and right side of the simple cells between the object and
transformed object pixel images, various types of configurations can be detected. A simple cell
with both sides residing in the object at pixel location (i, j) is illustrated in Figure 5. This cell
responds with an activation of 1 when

$$L_o(i,j;0) - R_o(i,j;0) > \alpha$$

where $L_o(i,j;0)$ is the integrated image intensity in the left side of this cell centered at (i, j) with
orientation number 0 and is defined by

$$L_o(i,j;0) = \iint_{\text{left}} I_o(x,y)\,dx\,dy$$

$R_o$ is defined analogously, and $\alpha$ is a tolerance parameter. This cell is tuned to the type of
local arrangement of the object in pixel space that is illustrated in Figure 5. The simple cells check
for gradients along the four orientations 0, 45, 90, and 135 deg (Figure 6).


FIGURE 5. SIMPLE CELL PROCESSORS
FIGURE 6. SIMPLE CELL ORIENTATIONS (ORIENTATION NO. 0: 0 DEG; NO. 1: 45 DEG;
NO. 2: 90 DEG; NO. 3: 135 DEG)
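
The simple cell response for orientation number 0 can be sketched as follows. This is a minimal illustration, assuming a square cell split by the vertical line through its center pixel (the report does not fix the cell shape here) and a binary image stored as nested lists; the function names are hypothetical.

```python
def half_sums(image, i, j, radius):
    """Integrated intensity in the left and right halves of a cell at (i, j)."""
    height, width = len(image), len(image[0])
    rows = range(max(0, i - radius), min(height, i + radius + 1))
    left = sum(image[r][c] for r in rows
               for c in range(max(0, j - radius), j))
    right = sum(image[r][c] for r in rows
                for c in range(j + 1, min(width, j + radius + 1)))
    return left, right


def simple_cell(image, i, j, radius, alpha):
    """Activation 1 when the left half exceeds the right half by more than alpha."""
    left, right = half_sums(image, i, j, radius)
    return 1 if left - right > alpha else 0


# Hypothetical 5 x 5 image with the object occupying the two leftmost columns.
edge_image = [[1, 1, 0, 0, 0] for _ in range(5)]
flat_image = [[1, 1, 1, 1, 1] for _ in range(5)]

print(simple_cell(edge_image, 2, 2, radius=2, alpha=3))
print(simple_cell(flat_image, 2, 2, radius=2, alpha=3))
```

The cell fires on the boundary of the object and stays silent in a uniform region, which is the behavior the tolerance parameter is meant to enforce.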
By changing the sign between the right and left terms, one may vary the direction of cell
sensitivity;  or  by  allowing  heterogeneous  cells  that  respond  to  one  sided  gradients  between  the  two 
pixel  spaces  in  the  same  areas  of  the  two  images,  misalignments  between  the  object  and  the 
transformed  object  can  be  detected.  The  fundamental  types  of  local  arrangements  require  eight 
simple  cell  types  as  indicated  in  Figures  7  and  8.  Four  types  are  homogeneous  in  that  both  cell 
sides  lie  in  the  same  pixel  space,  and  four  types  are  heterogeneous. 

FIGURE 7. SIMPLE PROCESSORS (HOMOGENEOUS), FOUR ORIENTATIONS
TYPE I (OBJECT): $\theta(L_o - R_o - \alpha)$
TYPE II (TRANSFORMED OBJECT): $\theta(L_{to} - R_{to} - \alpha)$
TYPE III (OBJECT): $\theta(R_o - L_o - \alpha)$
TYPE IV (TRANSFORMED OBJECT): $\theta(R_{to} - L_{to} - \alpha)$
FIGURE 8. SIMPLE PROCESSORS (HETEROGENEOUS)
TYPE V (OBJECT AND TRANSFORMED OBJECT): $\theta(L_o - L_{to} - \beta)$
TYPE VI (OBJECT AND TRANSFORMED OBJECT): $\theta(L_{to} - L_o - \beta)$
TYPE VII (OBJECT AND TRANSFORMED OBJECT): $\theta(R_o - R_{to} - \beta)$
TYPE VIII (OBJECT AND TRANSFORMED OBJECT): $\theta(R_{to} - R_o - \beta)$


COMPLEX  CELLS 

The simple cells may be combined together to create complex cells. Based on the type of
mismatch between the object and the transformed object, these complex cells indicate the local
correction needed in the transformed object's position. These complex cells perform a logical
"and" operation on the outputs of the simple cells that represent the salient features of the
configuration of the object/transformed object pair. If the complex cell responds with an activation
of 1, this indicates the correction needed on the transformed object to improve its alignment with
the reference object (Figure 9).




NAVSWCTR  91-609 


FIGURE 9. COMPLEX CELL ARCHITECTURE (LABELS: OBJECT; TRANSFORMED OBJECT; NEEDED
CORRECTION ON TRANSFORMED OBJECT. NOTE: Simple Cell Orientation Vector Is k)


For a given orientation, there are four object/transformed object cases and four
corresponding complex cells. These configurations have been designed to respond to
misalignments along the borders of the sets. Two of these cells indicate a correction on the
transformed object in the k direction, and two of these cells indicate a correction in the -k direction.
As indicated in Figure 10, these cells may be wired into a sigma unit in such a manner that the net
output indicates not only whether a local correction along this orientation is needed, but also the
direction of the correction. For each point in pixel space, a value of +1, 0, or -1 is assigned for
each of the four orientations. These are called the local gradients and are denoted at time t,
orientation n, and pixel location (i, j) as $\gamma_{ijn}(t)$.
FIGURE 10. LABELS: OBJECT; TRANSFORMED OBJECT; COMPLEX CELL NO. 1; COMPLEX CELL NO. 2;
COMPLEX CELL NO. 3; COMPLEX CELL NO. 4

Figure 11 portrays the four local gradient planes for a representative sample case at time
t=0. A symbol has been plotted at those points in pixel space where the local gradients are nonzero.
As portrayed in the figure, the local gradients represent the response of the complex cells to
boundary misalignments between the object and its transformed copy. As expected, there is no
response on the overlapping interiors of the two objects and on those mismatched boundary
sections that exceed the diameter of the simple cells.
GAUSSIAN AVERAGING

The six parameters that determine an affine mapping are uniquely determined by the action
of the mapping on three points. Therefore, three points residing in the transformed object image
must be chosen in order to find the new affine transformation that will improve the overlap between
the transformed and reference objects. These three points are chosen close to the boundary of the
transformed object to make best use of the local gradients. The first point is chosen along the ray
connecting the center of mass of the transformed object to the point of the set that is at a maximum
distance from the center of mass. The other two points are picked equally distributed in angular
space relative to this ray (see Figure 12).


FIGURE 12. REFERENCE POINTS USED FOR GAUSSIAN
GRADIENT COMPUTATION (LABELS: TRANSFORMED OBJECT; OBJECT; * REFERENCE POINTS)


Let the three points be designated $p_1 = (i_1, j_1)$, $p_2 = (i_2, j_2)$, and $p_3 = (i_3, j_3)$. For each of the
three points $(i_\lambda, j_\lambda)$, $\lambda = 1..3$, the goal is to compute a desired correction vector $\Delta r_\lambda(t)$ based on the
influence of the local gradients. The local gradients contribute a Gaussian weighted term to the
$\Delta r_\lambda(t)$ value at each point as part of a global averaging process. Let d be the squared Euclidean
distance between pixels $(i_\lambda, j_\lambda)$ and (i, j), and let $d_0$ be 0.25 times the squared length of the diagonal
of the rectangle that contains the transformed object. The Gaussian gradient at point $(i_\lambda, j_\lambda)$, time t,
and orientation n is given by

$$\Gamma_n(i_\lambda, j_\lambda; t) = \sum_i \sum_j \gamma_{ijn}(t)\,\exp[-d/d_0]$$

Once these are computed, we may compute the $\Delta r_\lambda(t)$ using the vector components of the
Gaussian gradients

$$x \text{ component of } \Delta r_1(t) = \sum_{n=0}^{3} (k_n \cdot \hat{\imath})\,\Gamma_n(i_1, j_1; t)$$

$$y \text{ component of } \Delta r_1(t) = \sum_{n=0}^{3} (k_n \cdot \hat{\jmath})\,\Gamma_n(i_1, j_1; t)$$

where $k_n$ is the orientation vector for orientation number n. Similarly for $\Delta r_2(t)$ and $\Delta r_3(t)$. Once
obtained, each of the $\Delta r_\lambda(t)$ is normalized separately.
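
The Gaussian averaging step can be sketched as follows. This is a minimal illustration under stated assumptions: the orientation vectors are taken to be unit vectors at 0, 45, 90, and 135 degrees, pixel index i is identified with x and j with y, and each local gradient plane is a nested list of +1, 0, or -1 values. None of these choices, nor the function names, are taken from the report.

```python
import math

# Assumed unit orientation vectors k_n for 0, 45, 90, and 135 degrees.
ORIENTATIONS = [(math.cos(math.radians(45 * n)), math.sin(math.radians(45 * n)))
                for n in range(4)]


def gaussian_gradient(plane, point, d0):
    """Sum of local gradients in one plane, weighted by exp(-d / d0)."""
    i_ref, j_ref = point
    total = 0.0
    for i, row in enumerate(plane):
        for j, g in enumerate(row):
            if g:
                d = (i - i_ref) ** 2 + (j - j_ref) ** 2  # squared distance
                total += g * math.exp(-d / d0)
    return total


def correction_vector(planes, point, d0):
    """Normalized correction vector at one reference point."""
    dx = dy = 0.0
    for (kx, ky), plane in zip(ORIENTATIONS, planes):
        weight = gaussian_gradient(plane, point, d0)
        dx += kx * weight
        dy += ky * weight
    norm = math.hypot(dx, dy)
    return (dx / norm, dy / norm) if norm > 0 else (0.0, 0.0)


# Hypothetical 3 x 3 case: one +1 local gradient in the orientation 0 plane.
empty = [[0, 0, 0] for _ in range(3)]
plane0 = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
print(correction_vector([plane0, empty, empty, empty], (0, 0), d0=4.5))
```

Normalizing each correction vector separately means only the direction of the Gaussian average matters; the step length is set later by the step size.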


COMPUTATION  OF  NEW  TRANSFORM 

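The new points $p_1'$, $p_2'$, $p_3'$ are computed using

$$p_\lambda' = p_\lambda + \Delta r_\lambda(t) \cdot \text{rstep}$$

where rstep is the current step size being used in the gradient descent process.
The transform T(t) which takes the $p_\lambda$ to the $p_\lambda'$ is recovered by solving

$$\begin{bmatrix} i_1' & j_1' & 1 \\ i_2' & j_2' & 1 \\ i_3' & j_3' & 1 \end{bmatrix} = \begin{bmatrix} i_1 & j_1 & 1 \\ i_2 & j_2 & 1 \\ i_3 & j_3 & 1 \end{bmatrix}\begin{bmatrix} a & c & 0 \\ b & d & 0 \\ e & f & 1 \end{bmatrix}$$

for the six parameters a, b, c, d, e, and f of T(t).

Finally, the new A transformation at time t+1 is computed

$$A(t+1) = T(t) \circ A(t)$$

where $\circ$ indicates right to left functional composition. This new transformation improves the
alignment of the boundaries, and therefore the interiors, of the transformed and reference objects.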
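
Recovering T(t) from three point pairs and composing it with A(t) can be sketched as below. This is a minimal illustration that solves the two 3 by 3 linear systems by Cramer's rule; the parameter ordering (a, b, e, c, d, f), with x' = a x + b y + e and y' = c x + d y + f, and all function names are assumptions made for the example.

```python
def det3(m):
    """Determinant of a 3 x 3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def step_points(points, corrections, rstep):
    """Move each reference point along its correction vector by rstep."""
    return [(x + rstep * dx, y + rstep * dy)
            for (x, y), (dx, dy) in zip(points, corrections)]


def solve_affine(src, dst):
    """Return (a, b, e, c, d, f) mapping the three src points onto dst."""
    base = [[x, y, 1.0] for x, y in src]
    det = det3(base)
    if abs(det) < 1e-12:
        raise ValueError("reference points are collinear")
    params = []
    for coord in range(2):          # 0 solves the x' row, 1 the y' row
        rhs = [p[coord] for p in dst]
        for col in range(3):        # Cramer's rule: replace one column
            m = [row[:] for row in base]
            for k in range(3):
                m[k][col] = rhs[k]
            params.append(det3(m) / det)
    return tuple(params)


def apply_affine(t, p):
    """Apply the affine map t to the point p."""
    a, b, e, c, d, f = t
    return (a * p[0] + b * p[1] + e, c * p[0] + d * p[1] + f)


def compose(t, s):
    """Parameters of t o s (apply s first, then t)."""
    ta, tb, te, tc, td, tf = t
    sa, sb, se, sc, sd, sf = s
    return (ta * sa + tb * sc, ta * sb + tb * sd, ta * se + tb * sf + te,
            tc * sa + td * sc, tc * sb + td * sd, tc * se + td * sf + tf)


src = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
true_t = (2.0, 0.5, 1.0, -0.5, 3.0, -2.0)
dst = [apply_affine(true_t, p) for p in src]
print(solve_affine(src, dst))
```

The collinearity check reflects why the three reference points are spread out in angle: three points on a line do not determine an affine map.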

NETWORK  ARCHITECTURE 

The complex cells for the four orientations may be combined together along with their
simple cell building blocks to produce a local gradient network architecture as portrayed in
Figure 13. The network inputs the pixel responses into each simple cell and combines the simple
cells together to form each complex cell. Finally, the network fuses the four complex cell activities
for each orientation to produce the four local gradients M1, M2, M3, and M4 at each point.

The Gaussian gradient calculation can also be performed in a connectionist manner. Given
the transformation A(t), the Gaussian gradient terms can be viewed as connections between two
fully interconnected pixel planes. With this convention in mind, it is possible to compute the total
number of connections in the network. Assuming a simple cell radius of 6, there are approximately
100 interconnections per simple cell. Using this value, we can compute the number of
interconnections in the local gradient portion of the network as

(100 intcon/sc)(4 sc/cc)(4 cc/orient)(4 orient)(# of pixels) = 6400 (# of pixels)

For a 200 by 200 pixel space, this produces around 256 million interconnections for the local
gradient portion of the network. The Gaussian gradient values require another
(200)(200)(200)(200) = 1.6 x 10^9 interconnects for storage; however, only (3)(200)(200) of these
are active at any given time. This produces a total of about 1.8 billion interconnects, of which
about 256 million are active at any given time.
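
The interconnect counts quoted above follow from simple arithmetic, reproduced here as a quick check. The variable names are illustrative.

```python
per_simple_cell = 100          # approximate interconnections per simple cell
per_pixel = per_simple_cell * 4 * 4 * 4   # 4 sc/cc, 4 cc/orientation, 4 orientations
pixels = 200 * 200

local_gradient_links = per_pixel * pixels     # local gradient portion
gaussian_links = pixels * pixels              # two fully interconnected planes
gaussian_active = 3 * pixels                  # three reference points

total_links = local_gradient_links + gaussian_links
active_links = local_gradient_links + gaussian_active

print(per_pixel, local_gradient_links, gaussian_links, total_links, active_links)
```

The exact total is 1.856 billion, and the active count exceeds 256 million by only 120,000, so the rounded figures in the text hold.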
FIGURE 13. NETWORK STRUCTURE FOR LOCAL GRADIENT COMPUTATION AT PIXEL (i,j)
(LABELS: LOCAL GRADIENTS AT PIXEL (I,J) FOR EACH ORIENTATION; INPUT PIXELS FOR
SIMPLE CELL CENTERED AT PIXEL (I,J))


CONVERGENCE  PROPERTIES  FOR  ONE-DIMENSIONAL  CASE 


The convergence properties of the VGD network will be illustrated with the analysis of two
simple one-dimensional cases. The one-dimensional general affine transformation is of the form

$f(x) = ax + b$, where $a, b \in R$. In this case, the affine transformation is uniquely determined
by its action on two points.
CASE 1-TRANSFORMATION IS SIMPLE TRANSLATION

Let $r = 1$, $0 < \alpha = \beta = \gamma \ll 1$. Consider the simple case where the object set is $I = [0,1]$, $E = [a, 1+a]$
is the transformed object, and $\gamma < a < 1 - \gamma$. That is, E is obtained from I by the action of a pure
translation (to the right, without loss of generality) of I by less than the cell radius r, and $I \cap E \neq \emptyset$.
The region $\Lambda \subset R^1$ of interest (the only points that can possibly be active for a cell centered at $x \in E$)
is $\Lambda = [a-r, 1+a+r]$.

There are five subregions to consider:

$$\Lambda = \bigcup_{i=1}^{5} \Lambda_i = [a-r, 0] \cup [0, a] \cup [a, 1] \cup [1, 1+a] \cup [1+a, 1+a+r].$$

We wish to show that the total contribution of restorative cells

$$P_1 = \{x \in \Lambda : \text{Case 2 or Case 3 holds}\}$$

is greater than any possible contribution from improper motion

$$P_2 = \{x \in \Lambda : \text{Case 1 or Case 4 holds}\}$$

thereby producing restorative dynamics. Case here refers to the object/transformed object cases
illustrated in Figure 14. To simplify notation, the Lebesgue measure of the left side of the
one-dimensional simple cell residing in the set I will be denoted L(I) instead of $\mu(L_I)$. For
$x \in \Lambda_1 = [a-r, 0]$ we have L(I) = L(E) = 0, ruling out all four cases. Thus,
$P_1 \cap \Lambda_1 = P_2 \cap \Lambda_1 = \emptyset$. Similarly, for $\Lambda_5 = (1+a, 1+a+r]$, $P_1 \cap \Lambda_5 = P_2 \cap \Lambda_5 = \emptyset$.
For $x \in \Lambda_3 = [a, 1]$ we have

L(I) = x
R(I) = 1 - x
L(E) = x - a
R(E) = 1 - (x - a)

In particular, R(E) > R(I), ruling out Cases 1 and 2, and L(I) > L(E), ruling out Cases 3
and 4. Thus, $P_1 \cap \Lambda_3 = P_2 \cap \Lambda_3 = \emptyset$. It remains only to consider $\Lambda_2 = [0, a]$ and $\Lambda_4 = [1, 1+a]$.
For $x \in \Lambda_2$, we have

L(I) = x
R(I) = 1 - x
L(E) = 0
R(E) = 1 - (a - x)

L(E) = 0 rules out Cases 1, 3, and 4. The conditions for Case 2 are

(i) $x > \gamma$

(ii) $1 - x > 1 - (a - x) + \gamma$

(iii) $1 - (a - x) > \gamma$

(iv) $1 - x > x + \gamma$

(i) and (iii) are satisfied for $x > \gamma$. (ii) and (iv) imply $(x < 1/2 - \gamma/2) \cap (x < a/2 - \gamma/2)$. That
is, Case 2 is satisfied for

$$\Delta_1 = \{x : \gamma < x < \min(1/2 - \gamma/2,\ a/2 - \gamma/2)\}$$

For $x \in \Lambda_4$, we have (writing $x = 1 + y$, $y \in [0, a]$)

L(I) = 1 - y
R(I) = 0
L(E) = 1 - (a - y)
R(E) = a - y

R(I) = 0 rules out Cases 1, 2, and 4. The conditions for Case 3 imply $(a - y < 1/2 - \gamma/2) \cap (y > a/2 + \gamma/2)$,
or Case 3 is satisfied for

$$\Delta_2 = \{y : \max(a - 1/2 + \gamma/2,\ a/2 + \gamma/2) < y < a - \gamma\}$$

$$= \{x : \max(a + 1/2 + \gamma/2,\ a/2 + 1 + \gamma/2) < x < a + 1 - \gamma\}$$

In summation, then, we have $P_1 \cap [\Lambda_2 \cup \Lambda_4] \supset P_2 \cap [\Lambda_2 \cup \Lambda_4] = \emptyset$. Thus we have net
restorative action, as desired. For convergence (that is, $E \to I$), it suffices then to consider the
step size $s_n \to 0$, with $\sum_n s_n = \infty$. For recovery of transformation, we may consider having chosen
one point from each of $\Delta_1$, $\Delta_2$. That is, $x_1 \in \Delta_1$, $x_2 \in \Delta_2$ will both yield restorative action. Hence,
overall action will be translation to the left as desired. An obvious choice, a priori, for $x_1$, $x_2$ is $x_1 = a$,
$x_2 = 1 + a$. This choice will assure the maximum Gaussian contribution to the restorative force from
those points in $\Delta_1$ and $\Delta_2$.
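
The interval on which Case 2 holds can be checked numerically. The sketch below assumes that conditions (i) through (iv) are differences of the cell measures L(I), R(I), L(E), and R(E) compared against a common tolerance, which reduces to the four inequalities listed above when L(E) = 0; the function names and the sample values a = 0.4 and tolerance 0.05 are illustrative.

```python
def overlap(lo, hi, a, b):
    """Length of the intersection of [lo, hi] with [a, b]."""
    return max(0.0, min(hi, b) - max(lo, a))


def cell_measures(x, interval, r=1.0):
    """Measures of the left and right cell halves that lie inside interval."""
    lo, hi = interval
    return overlap(x - r, x, lo, hi), overlap(x, x + r, lo, hi)


def case2_holds(x, a, gamma, r=1.0):
    """Evaluate the four Case 2 inequalities for a cell centered at x."""
    l_i, r_i = cell_measures(x, (0.0, 1.0), r)        # object I = [0, 1]
    l_e, r_e = cell_measures(x, (a, 1.0 + a), r)      # copy E = [a, 1 + a]
    return (l_i - l_e > gamma and r_i - r_e > gamma
            and r_e - l_e > gamma and r_i - l_i > gamma)


def case2_interval(a, gamma):
    """Closed-form bounds on x for which Case 2 holds."""
    return gamma, min(0.5 - gamma / 2, a / 2 - gamma / 2)


a, gamma = 0.4, 0.05
print(case2_interval(a, gamma))
print([x / 100 for x in range(0, 41) if case2_holds(x / 100, a, gamma)])
```

With these sample values the scan reports active cell centers only strictly between the two closed-form bounds.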
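FIGURE 14. ONE-DIMENSIONAL OBJECT/TRANSFORMED OBJECT CASES (LOCAL VIEWPOINT AND
GLOBAL VIEWPOINT)
CASE 1: $[\mu(L_I) - \mu(L_E) > \beta] \wedge [\mu(R_I) - \mu(R_E) > \beta] \wedge [\mu(L_I) - \mu(R_I) > \alpha] \wedge [\mu(L_E) - \mu(R_E) > \alpha]$
CASE 2: $[\mu(L_I) - \mu(L_E) > \beta] \wedge [\mu(R_I) - \mu(R_E) > \beta] \wedge [\mu(R_E) - \mu(L_E) > \alpha] \wedge [\mu(R_I) - \mu(L_I) > \alpha]$
CASE 3: $[\mu(L_E) - \mu(L_I) > \beta] \wedge [\mu(R_E) - \mu(R_I) > \beta] \wedge [\mu(L_I) - \mu(R_I) > \alpha] \wedge [\mu(L_E) - \mu(R_E) > \alpha]$
CASE 4: $[\mu(L_E) - \mu(L_I) > \beta] \wedge [\mu(R_E) - \mu(R_I) > \beta] \wedge [\mu(R_I) - \mu(L_I) > \alpha] \wedge [\mu(R_E) - \mu(L_E) > \alpha]$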


CASE  2-TRANSFORMATION  IS  SIMPLE  SCALING 

Again, let $r = 1$, $0 < \alpha = \beta = \gamma \ll 1$. Consider $I = [0,1]$, $E = [0,a]$, $\gamma < a < 1 - \gamma$. That is, E is a simple
scaling of I. Then, as before, we write the region of interest $\Lambda$ as $\Lambda = [-r, 1+r] = \bigcup_{i=1}^{4} \Lambda_i$,
where $\Lambda_1 = [-r, 0]$, $\Lambda_2 = [0, a]$, $\Lambda_3 = [a, 1]$, and $\Lambda_4 = [1, 1+r]$. Similar to above, we obtain
$P_1 \cap \Lambda_1 = P_1 \cap \Lambda_4 = P_2 \cap \Lambda_1 = P_2 \cap \Lambda_4 = \emptyset$. Thus it remains to consider $\Lambda_2$ and $\Lambda_3$.
For $x \in \Lambda_2$, L(I) = L(E) = x, ruling out all four cases, and $P_1 \cap \Lambda_2 = P_2 \cap \Lambda_2 = \emptyset$.
For $x \in \Lambda_3$, we have

L(I) = x
R(I) = 1 - x
L(E) = a
R(E) = 0

Case 1 is satisfied for

$$\Delta_3 = \{x : \max(1/2 + \gamma/2,\ a + \gamma) < x < 1 - \gamma\}$$

Thus, $P_2 \cap \Lambda_3 \supset P_1 \cap \Lambda_3 = \emptyset$, and we have net restorative effect. Convergence then
requires only the conditions on the step size $s_n$ noted above. For transformation recovery, we see
that $x_1 = 0$, $x_2 = a$ again yield the required dynamics.

For  a  case  in  which  we  have  both  scaling  and  translation,  the  translation  effects  align  the 
objects  (to  within  y  of  perfect  alignment),  then  the  scaling  takes  place,  and  convergence  is 
maintained. 


RESULTS 


The capability of the VGD network to recover an unknown affine transformation can best
be illustrated with a simple example. The results presented here were produced by a serial
implementation of the VGD network running on a Silicon Graphics 4D/220. One would expect the
results produced by an analog implementation of the network architecture to be similar.

In the standard ATR process, a transformation is sought that optimally aligns a reference
model with a copy of itself. In this example, the equivalent problem of aligning the transformed
copy with the reference model is solved. Given a copy of a square that has been subjected to an
unknown affine transformation, we wish to solve for the inverse transform.

Figure 15 portrays the 50 steps needed by the VGD process to recover the transformation.
The upper left corner of the figure represents the initial configuration of the square and its
transformed copy. The salient simulation parameters for the run are summarized in Table 1. The
first 20 iterations act to align the centers of the two images. The last 30 iterations scale the
transformed copy and rotate it into position over the reference image. After 50 iterations, the
two images are aligned to a tolerance of 1 pixel. The number of steps required for convergence is
small compared to the 500 or so steps required by a random search technique such as
generalized simulated annealing (Reference 5).
FIGURE 15. VGD RESULTS FOR SQUARE (PANELS: INITIALIZATION; 10 ITERATIONS; 20 ITERATIONS;
30 ITERATIONS; 40 ITERATIONS; 50 ITERATIONS)
TABLE 1. SIMULATION PARAMETERS

CHARACTERISTIC        MEASUREMENT
Image Limits          -.05 to 1.0
Image Size            200 x 200 pixels
Simple Cell Radius    5 pixels
Cell Threshold        3 pixels
Step Size             .005 (1 pixel)


CONCLUSION 


The VGD network provides a new technique to rapidly recover an unknown transformation
relating two objects. Depending on the type of transformation, the VGD network has direct
applications to two- and three-dimensional ATR and to image compression using IFS. Although
not impervious to local minima, the rapid convergence of this guided technique offers a clear
advantage over random search techniques such as simulated annealing and genetic algorithms.

The authors are continuing work on the use of the VGD network. Current work includes a
parallel version of the VGD network that covers an image with multiple copies of itself, a
model-based 6 DoF ATR system employing both generalized simulated annealing and VGD, and an
analog implementation of the VGD network. All of these efforts provide further evidence of the
utility and power of the generic VGD architecture.